Abstract

We study the convergence time of the best response dynamics in player-specific singleton 
congestion games. It is well known that this dynamics can cycle, although from every state 
a short sequence of best responses to a Nash equilibrium exists. Thus, the random best 
response dynamics, which selects the next player to play a best response uniformly at random, 
terminates in a Nash equilibrium with probability one. In this paper, we are interested in the 
expected number of best responses until the random best response dynamics terminates. 

As a first step towards this goal, we consider games in which each player can choose between 
only two resources. These games have a natural representation as (multi-)graphs by identifying 
nodes with resources and edges with players. For the class of games that can be represented 
as trees, we show that the best response dynamics cannot cycle and that it terminates after
$O(n^2)$ steps where n denotes the number of resources. For the class of games represented as
cycles, we show that the best response dynamics can cycle. However, we also show that the
random best response dynamics terminates after $O(n^2)$ steps in expectation.

Additionally, we conjecture that in general player-specific singleton congestion games there 
exists no polynomial upper bound on the expected number of steps until the random best 
response dynamics terminates. We support our conjecture by presenting a family of games for 
which simulations indicate a super-polynomial convergence time. 



* Parts of the results presented here already appeared in the Proceedings of the 4th Symposium on Stochastic
Algorithms, Foundations, and Applications (SAGA) in 2007 [1].



1 Introduction 



We study the convergence time of the best response dynamics to pure Nash equilibria^1 in player-
specific singleton congestion games. In such games, we are given a set of resources and a set of
players. Each player is equipped with a set of non-decreasing, player-specific delay functions which 
measure the delay the player experiences from allocating a particular resource and sharing it with 
a certain number of other players. A player's goal is to allocate a single resource with minimum 
delay given fixed choices of the other players. Milchtaich [12], who was the first to consider player- 
specific singleton congestion games, proves that every such game possesses a Nash equilibrium 
which can be computed efficiently. However, he also observes that these games are not potential
games P3], that is, the best response dynamics, in which players consecutively change to resources 
with minimum delay, can cycle. This is in contrast to congestion games with common delay 
functions in which all players sharing a resource observe the same delay. In the following, we refer 
to congestion games with common delay functions as standard congestion games. Rosenthal [15],
who introduces standard congestion games, proves that they always admit a potential function
guaranteeing the existence of Nash equilibria and that the best response dynamics cannot cycle.
Ieong et al. [9] consider the convergence time of the best response dynamics to Nash equilibria in
standard singleton congestion games. They observe that the delay values can be replaced by their
ranks in the sorted list of these values without affecting the best response dynamics. By applying
Rosenthal's potential function to these new delay functions they observe that after at most $n^2 m$
best responses a Nash equilibrium is reached, where n equals the number of players and m the
number of resources. This result is independent of any assumption on the ordering according to
which players change their strategies.

Since the best response dynamics in player-specific singleton congestion games can cycle, we 
propose to study random best response dynamics in such games. This approach is motivated by 
the following observation due to Milchtaich [12]: From every state of a player-specific singleton
congestion game there exists a polynomially long sequence of best responses leading to a Nash 
equilibrium. Thus, the random best response dynamics selecting the next player to play a best 
response at random terminates with probability one after a finite number of steps. Milchtaich's 
analysis leaves open the question how long it takes until the random best response dynamics 
terminates in expectation. In this paper, we address this question as we think that it is a natural 
and interesting one. Currently, we are not able to analyze the convergence time in arbitrary 
player-specific singleton congestion games. However, our experimental results support the following 
conjecture. 

Conjecture 1. There exist player-specific singleton congestion games in which the expected number
of steps until the random best response dynamics terminates is super-polynomial. 

In order to gain insights into the random best response dynamics, we begin with very simple 
yet interesting classes of games, and consider games in which each player chooses between only 
two alternatives. These games can be represented as multi-graphs: each resource corresponds to a 
node and each player to an edge. In the following, we call games that can be represented as graphs 
with topology t player-specific congestion games on topology t. First, we consider games on trees 
and circles. 

We prove that player-specific congestion games on trees admit a potential function from which 
we derive an upper bound of $O(n^2)$ on the maximum number of best responses until a Nash
equilibrium is reached. Note that this result is independent of the initial state and any assumption
on the ordering according to which players change their strategies. 

The result is based on the observation that one can replace the player-specific delay functions by
common delay functions without changing the players' preferences. Thus, player-specific congestion
games on trees are isomorphic to standard congestion games on trees and we can apply the result
of Ieong et al. [9] to upper bound the convergence time. We proceed with player-specific congestion
games on circles, and show that these games are the simplest games in which the best response
dynamics can cycle. As we are only given four different delay values per player, we characterize
with respect to the ordering of these four values in which cases the best response dynamics can
cycle. We observe that only one such case exists. Finally, we analyze the convergence time of
the random best response dynamics in such games, and prove a bound of $O(n^2)$ on the expected
number of steps until the dynamics terminates. In order to prove this result we introduce the
notion of over- and underload tokens. An overload token indicates that a resource is shared by two
players, an underload token indicates that it is unused. We observe that the number of tokens
cannot increase, and that once in a while tokens get stuck or vanish.

^1 In the following, the term Nash equilibrium always refers to a pure one.

Based on the insights gained by analyzing player-specific congestion games on circles we present 
a family of games and conjecture that there exists no polynomial upper bound on the expected 
time until the random best response dynamics terminates. Obviously, this depends on the initial 
state, and so we implicitly assume that the initial configuration is chosen appropriately. Our 
conjecture is motivated by a slightly different notion of over- and underload tokens. Now their 
definition depends on the fact that every resource has a fixed congestion that it takes in every 
Nash equilibrium. In contrast to games on circles we show that the number of over- and underload 
tokens can also increase if the initial configuration is chosen appropriately. Intuitively one may 
think of the number of tokens as a measure of derangement of order. In games on circles this 
measure can only decrease whereas it can also increase in general games. We fail to give a rigorous 
proof of a super-polynomial lower bound. However, we support our conjecture by empirical results 
obtained from simulations. 

1.1 Definitions and Notations 

A player-specific singleton congestion game $\Gamma$ is a tuple $(\mathcal{N}, \mathcal{R}, (\Sigma_i)_{i \in \mathcal{N}}, (d_r^i)_{i \in \mathcal{N}, r \in \mathcal{R}})$ where $\mathcal{N}$ denotes
the set of n players, $\mathcal{R}$ the set of m resources, $\Sigma_i \subseteq \mathcal{R}$ the strategy space of player i, and $d_r^i : \mathbb{N} \to \mathbb{N}$
a strictly increasing delay function associated with player i and resource r. In the following, we
assume that ties are broken arbitrarily. That is, for every pair of resources $r_1, r_2 \in \mathcal{R}$ and every
pair $n_{r_1}, n_{r_2} \in \mathbb{N}$: $d^i_{r_1}(n_{r_1}) \neq d^i_{r_2}(n_{r_2})$. We denote by $S = (r_1, \ldots, r_n)$ the state of the game in
which player i allocates resource $r_i \in \Sigma_i$. For a state S, we define the congestion $n_r(S)$ on resource
r by $n_r(S) = |\{i \mid r = r_i\}|$, that is, $n_r(S)$ equals the number of players sharing resource r in state
S. We assume that players act selfishly and wish to allocate resources minimizing their individual
delays. The delay of player i from allocating resource r in state S is given by $d_r^i(n_r(S))$. Given a
state $S = (r_1, \ldots, r_n)$, we call a resource $r^* \in \Sigma_i \setminus \{r_i\}$ a best response of player i to S if, for all
$r' \in \Sigma_i \setminus \{r_i\}$, $d^i_{r^*}(n_{r^*}(S) + 1) \le d^i_{r'}(n_{r'}(S) + 1)$, and if $d^i_{r^*}(n_{r^*}(S) + 1) < d^i_{r_i}(n_{r_i}(S))$. Furthermore,
we call $r_i$ a best response of player i to S if, for all $r' \in \Sigma_i \setminus \{r_i\}$, $d^i_{r_i}(n_{r_i}(S)) < d^i_{r'}(n_{r'}(S) + 1)$.
The standard solution concept in player-specific singleton congestion games is the Nash equilibrium. A
state S is a Nash equilibrium if for each player i the resource $r_i$ is a best response.
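The definitions above can be sketched in a few lines of Python. The representation is an assumption made purely for illustration and does not come from the paper: a state is a list of chosen resources, `strategies[i]` plays the role of the strategy space of player i, and `delay[i][r]` is a callable mapping a congestion value to the delay of player i on resource r.

```python
from collections import Counter


def congestion(state):
    """Return n_r(S): the number of players allocating each resource."""
    return Counter(state)


def best_response(i, state, strategies, delay):
    """Return the resource player i would move to, or state[i] if staying is best.

    Hypothetical helper: delays on other resources are evaluated with
    player i added, the current delay with the congestion as it is.
    """
    load = congestion(state)
    current = state[i]
    current_delay = delay[i][current](load[current])
    options = [(delay[i][r](load[r] + 1), r) for r in strategies[i] if r != current]
    if not options:
        return current
    best_delay, best = min(options)
    # A move is a best response only if it strictly improves the delay.
    return best if best_delay < current_delay else current


def is_nash(state, strategies, delay):
    """A state is a Nash equilibrium if every player already plays a best response."""
    return all(best_response(i, state, strategies, delay) == state[i]
               for i in range(len(state)))
```

For instance, with two players who both choose between resources "a" and "b", the state in which both allocate "a" is not an equilibrium as soon as the delay of "b" under congestion 1 is below the delay of "a" under congestion 2.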

In this paper, we consider games that have natural representations as graphs. We assume that 
each player chooses between only two resources. In this case, we can represent the resources as 
the nodes of a graph and the players as the edges. If different players choose between the same 
two resources, then the corresponding graph has multi-edges. The direction of an edge naturally 
corresponds to the strategy the player currently plays. 

In the following, we will sometimes refer to standard singleton congestion games. Standard
singleton congestion games are defined in the same way as player-specific singleton congestion
games except that we are not given player-specific delay functions $d_r^i$, $r \in \mathcal{R}$, $i \in \mathcal{N}$, but common
delay functions $d_r$, $r \in \mathcal{R}$. Ieong et al. [9] observe that in standard singleton congestion games one
can always replace the delay values $d_r(n_r)$ with $r \in \mathcal{R}$ and $1 \le n_r \le n$ by their ranks in the sorted
list of these values without affecting the players' preferences in any state of the game. Note that this
approach is not restricted to standard singleton congestion games but also applies to player-specific
singleton congestion games. That is, given a player-specific congestion game $\Gamma$, fix a player i and
consider a list of all delays $d_r^i(n_r)$ with $r \in \mathcal{R}$ and $1 \le n_r \le n$. Assume that this list is sorted in
non-decreasing order. For each resource r, we define an alternative player-specific delay function
$\tilde{d}_r^i : \mathbb{N} \to \mathbb{N}$ where, for each possible congestion $n_r$, $\tilde{d}_r^i(n_r)$ equals the rank of the delay $d_r^i(n_r)$ in
the aforementioned list of all delays. Due to our assumptions on the delay functions, all ranks are
unique. In the following, we define the type of a player i by the ordering of the player-specific
delays $d_r^i(1), \ldots, d_r^i(n)$ of the resources $r \in \Sigma_i$.
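The rank replacement described above can be illustrated as follows. This is a minimal sketch under assumed names (`rank_delays`, `delay_i`), not code from the paper; it relies on the stated assumption that all delay values of a player are distinct.

```python
def rank_delays(delay_i, resources, n):
    """Map every (resource, congestion) pair of one player to the rank of its delay.

    delay_i[r] is assumed to be a callable giving the player's delay on
    resource r for a congestion between 1 and n.
    """
    values = sorted((delay_i[r](k), r, k)
                    for r in resources for k in range(1, n + 1))
    # Ranks start at 1; distinct delays guarantee that ranks are unique.
    return {(r, k): position + 1 for position, (_, r, k) in enumerate(values)}
```

Because only the order of the delay values enters the definition of a best response, a player whose delays are replaced by these ranks prefers exactly the same resources in every state.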

We define the transition graph $TG(\Gamma)$ of a player-specific singleton congestion game $\Gamma$ as the
graph that contains a vertex for every state of the game. Moreover, there is a directed edge labeled
with i from state S to state S' if we obtain S' from S by permitting player i to play a best response
in S.

We call the dynamics in which players iteratively play best responses the best response dynamics. 
Furthermore, we use the term best response schedule to denote an algorithm that selects, given 
a state S, the next player to play a best response. We assume that such a player is always 
selected among those players who have an incentive to change their strategy. The convergence time 
$t(n, m)$ of a best response schedule is the maximum number of steps to reach a Nash equilibrium
in any game with n players and m resources and for any initial state. If the schedule selects
the next player to play a best response uniformly at random then $t(n, m)$ refers to the maximum
expected convergence time. We use the term random best response dynamics to denote the resulting 
dynamics. Additionally, we use the terms best response dynamics and best response schedule 
interchangeably. 
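As an illustration of the random best response dynamics, the following sketch repeatedly picks, uniformly at random, one of the players who currently have an incentive to change, and lets that player play a best response. The data layout (`strategies`, `delay`, a list-valued state) and the `max_steps` safeguard are assumptions of this example only.

```python
import random
from collections import Counter


def improving_move(i, state, strategies, delay):
    """Return player i's best response if it strictly lowers i's delay, else None."""
    load = Counter(state)
    current = state[i]
    options = [(delay[i][r](load[r] + 1), r) for r in strategies[i] if r != current]
    if not options:
        return None
    best_delay, best = min(options)
    return best if best_delay < delay[i][current](load[current]) else None


def random_best_response_dynamics(state, strategies, delay, rng, max_steps=100_000):
    """Run the dynamics; return the final state and the number of best responses."""
    state = list(state)
    for steps in range(max_steps):
        unhappy = [i for i in range(len(state))
                   if improving_move(i, state, strategies, delay) is not None]
        if not unhappy:
            return state, steps  # Nash equilibrium reached
        i = rng.choice(unhappy)  # uniform choice among players with an incentive
        state[i] = improving_move(i, state, strategies, delay)
    raise RuntimeError("no equilibrium reached within max_steps")
```

Passing a seeded `random.Random` instance as `rng` makes individual runs reproducible, which is convenient when estimating the expected number of steps by repeated simulation.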

1.2 Related Work 

We already mentioned that every player-specific singleton congestion game possesses a Nash equilibrium
which can be computed efficiently. Moreover, we mentioned that such games are not potential
games, even though from every state there exists a polynomially long sequence of best responses
leading to a Nash equilibrium. These results are due to Milchtaich [12]. Milchtaich also observes
that player-specific network congestion games, i.e., games in which each player wants to allocate a 
path in a network, do not possess Nash equilibria in general jBj. He proposes to characterize those 
games with respect to their networks which always possess Nash equilibria. Such a characterization
should be independent of further assumptions on the delay functions. Ackermann, Röglin, and
Vöcking [4] extend the results presented in [12] to player-specific matroid congestion games, and
prove that the matroid property is the maximal property with respect to the combinatorial structure
of the players' strategy spaces guaranteeing the existence of Nash equilibria. In such games,
the players' strategy spaces are sets of bases of matroids over the resources. Gairing, Monien, and
Tiemann [7] consider player-specific singleton congestion games with linear delay functions without
offsets and prove among other results that such games are potential games.

A model closely related to player-specific congestion games is that of standard congestion games.
Rosenthal [15] proves that these games are potential games. Ieong et al. [9] address the convergence
time of the best response dynamics in standard singleton congestion games. They show that the
best response dynamics converges quickly. Fabrikant, Papadimitriou, and Talwar [6] show that in
general standard congestion games players do not converge quickly. Their result holds especially
in the case of network congestion games, in which players seek to allocate paths in a network. Later,
Ackermann, Röglin, and Vöcking [3] extended the result of Ieong et al. [9] to matroid congestion
games, and proved that the matroid property is the maximal property on the players' strategy
spaces guaranteeing polynomial convergence time. Even-Dar et al. [5] consider the convergence
time in standard singleton congestion games with weighted players.

Another model with properties similar to those of player-specific singleton congestion games
is that of two-sided markets. In these games, we are given a set of resources and a set of players, and
for every resource and every player a preference list of the elements of the other set. Given such a
game, one seeks a stable matching assigning players to resources such that there exists no pair
of player and resource that are not matched to each other but prefer each other to their current
matches. Gale and Shapley [H] prove that stable matchings always exist. Knuth [TU] proposes to
study better or best response dynamics in such games and observes that they can cycle. However,
Roth and Vande Vate [Hj observe that short better response paths to stable matchings always
exist. Ackermann et al. follow this line of research and prove an exponential lower bound on
the expected time until the random better (best) response dynamics terminates.

2 Player-specific Congestion Games on Trees 

In this section, we consider player-specific congestion games on trees. Note that in such games the 
number of resources equals the number of players. First, we observe that one can always replace 
the player-specific delay functions by common delay functions such that the players' types are 
preserved. Hence, we obtain a standard singleton congestion game, whose transition graph equals 
the transition graph of the player-specific game. We prove the following theorem. 

Theorem 2. In every player-specific congestion game on a tree with n nodes, every best response
schedule terminates after at most $2n^2$ steps.

Proof. Let $\Gamma$ be a player-specific congestion game on a tree. In the following, we describe how
to replace the player-specific delay functions of $\Gamma$ by common delay functions $d_r : \mathbb{N} \to \mathbb{N}$, $r \in \mathcal{R}$,
with the following property: For every player i its type with respect to the player-specific delay
functions equals its type with respect to the common delay functions. Remember that the types
completely describe the preferences of the players, and hence, the transition graph of $\Gamma$ is not
affected by replacing the player-specific delay functions by common ones. Since the resulting game
is a standard singleton congestion game, $\Gamma$ is a potential game and we can apply the result of Ieong
et al. [9] to upper bound the convergence time. Obviously, the same bound holds in $\Gamma$. Thus, by
applying the proof of the convergence time in standard singleton congestion games as presented
in [3], we conclude that every best response schedule for player-specific congestion games on trees
terminates after at most $2n^2$ steps.

We prove the theorem by induction on the number of players and describe how to construct a
sequence of player-specific congestion games $\Gamma_1, \ldots, \Gamma_n$ on trees with the following properties. $\Gamma_1$
is obtained from $\Gamma$ by removing the players 2 to n from the game. The set of resources in $\Gamma_1$ is the
set of the two resources the first player is interested in. Now $\Gamma_i$ is obtained from $\Gamma_{i-1}$ by adding
one player and one resource to $\Gamma_{i-1}$. The player and the resource are chosen in such a way that $\Gamma_i$
is a player-specific congestion game on a tree. That is, we choose a player i who is interested in a
resource r of $\Gamma_{i-1}$, and add the additional resource r' the player is interested in to $\Gamma_i$.

Obviously $\Gamma_1$, the player-specific congestion game with a single player and two resources, is a
standard congestion game. As induction hypothesis assume that we already replaced the player-
specific delay functions in $\Gamma_{i-1}$ by common ones without affecting the players' types. For ease of
notation let $\Gamma^*_{i-1}$ be this game. In the following, we assume that for every resource r in $\Gamma^*_{i-1}$ its
delay function $d_r$ is defined for all possible congestion values between 1 and n and not only for
the maximum number of players that are interested in r in $\Gamma^*_{i-1}$.

Now given $\Gamma^*_{i-1}$, we describe how to choose the delay functions $d_r$ of the resources in $\Gamma^*_i$ such
that the players in $\Gamma^*_i$ and $\Gamma_i$ have the same types. The delay functions of the resources r that
belong to $\Gamma^*_{i-1}$ are the same as in $\Gamma^*_{i-1}$. Additionally, we assume that for every such resource r
and every congestion $1 < n_r \le n$, $d_r(n_r) - d_r(n_r - 1) \ge n$. If this is not the case, then due to our
assumption that the delay functions are strictly increasing, we can scale all delays by a factor of n
in order to achieve the desired goal. Thus, it remains to choose a delay function for the additional
resource r' that does not belong to $\Gamma^*_{i-1}$. Since the gap between consecutive values of the delay
functions $d_r$ is large enough, we can realize every type for the additional player by choosing the
delay function $d_{r'}$ appropriately.
Applying the result from [3] to the game $\Gamma^*_n$ directly implies the theorem.



□ 



3 Player-specific Congestion Games on Circles 

In this section, we consider player-specific congestion games on circles. Without loss of generality,
we assume that the resources are enumerated from $0, \ldots, n-1$, and that they are arranged
in increasing order clockwise. Furthermore, we assume w.l.o.g. that for every player i,
$\Sigma_i = \{r_i, r_{i+1 \bmod n}\}$. In the following, we call $r_i$ the 0- and $r_{i+1 \bmod n}$ the 1-strategy of player i.
Furthermore, we drop the mod n terms and assume that all indices are computed modulo n. Due
to our assumptions on the delay functions, there are six different types of players in such games:



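type 1: $d^i_{r_i}(1) < d^i_{r_i}(2) < d^i_{r_{i+1}}(1) < d^i_{r_{i+1}}(2)$
type 2: $d^i_{r_i}(1) < d^i_{r_{i+1}}(1) < d^i_{r_i}(2) < d^i_{r_{i+1}}(2)$
type 3: $d^i_{r_i}(1) < d^i_{r_{i+1}}(1) < d^i_{r_{i+1}}(2) < d^i_{r_i}(2)$
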
We call the three other types, which can be obtained by exchanging the identities of the
resources $r_i$ and $r_{i+1}$ in the above inequalities, type 1', type 2', and type 3'. Furthermore, we call
two players i and j consecutive, if they share a resource, that is, if $j = i + 1$ or $i = j + 1$. Given
a state S, we call two consecutive players synchronized, if both play the same strategy, that is, if
both either play their 0- or their 1-strategy. Moreover, we call a set of consecutive players $i, \ldots, j$
synchronized if all players play the same strategy.
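The classification into six types can be made concrete with a small helper. The numbering used here is an assumption of this sketch, chosen to agree with the behaviour described in the remainder of the section: a type 1 player prefers its 0-strategy under every congestion, a type 2 player prefers its 0-strategy whenever both resources would have equal congestion, and a type 3 player prefers its 0-strategy at congestion 1 but its 1-strategy at congestion 2. The function name and argument layout are hypothetical.

```python
def player_type(d0, d1):
    """Classify a player on a circle by the order of its four delay values.

    d0 = (delay on r_i under congestion 1, under congestion 2)
    d1 = (delay on r_{i+1} under congestion 1, under congestion 2)
    All four values are assumed distinct and each pair increasing.
    """
    if d0[0] > d1[0]:
        # Exchanging the roles of the two resources yields the primed types.
        return player_type(d1, d0) + "'"
    if d0[1] < d1[0]:
        return "1"  # both values of the 0-strategy resource come first
    return "2" if d0[1] < d1[1] else "3"
```

Since the two pairs are increasing, the four values can interleave in exactly six ways, which is why no further case distinction is needed.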

3.1 Cycles in the Transition Graphs and a Lower Bound 

We present an infinite family of games possessing cycles in their transition graphs. From this 
construction we derive a lower bound of $\Omega(n^2)$ on the convergence time of the random best response
dynamics in player-specific congestion games on circles. 

Consider a game on a circle with n players which are all of type 3. It is not difficult to verify 
that this game possesses only two Nash equilibria: either all players play their 0-strategy or their 
1-strategy. Consider now a state S with the following properties: In S we can partition the players
into two non-empty sets $S_0$ and $S_1$ of synchronized players. Players in $S_0$ all play their 0-strategy,
whereas players in $S_1$ all play their 1-strategy. Again, it is not difficult to verify that in every such
state there are exactly two players who have an incentive to change their strategies. From both 
sets only the first player clockwise has an incentive to change its strategy. Thus, there exist cycles 
in the transition graphs of these games. We obtain such a cycle by selecting players from the two 
sets alternately, and letting them play best responses. 
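The claims of this paragraph can be checked mechanically on a small instance. The sketch below assumes that every player has the same four delay values, ordered so that the 0-strategy is preferred at congestion 1 and the 1-strategy at congestion 2 (the default arguments are one such choice and are not taken from the paper); `strategy[i]` is 0 if player i allocates $r_i$ and 1 if it allocates $r_{i+1}$.

```python
def movers(strategy, d0=(1, 4), d1=(2, 3)):
    """Return the players on a circle that have an incentive to switch.

    d0[k-1] / d1[k-1] is the delay of the 0- / 1-strategy resource under
    congestion k; the default values are an illustrative assumption.
    """
    n = len(strategy)
    load = [0] * n
    for i, s in enumerate(strategy):
        load[(i + s) % n] += 1
    result = []
    for i, s in enumerate(strategy):
        own = (i + s) % n
        other = (i + 1 - s) % n
        own_delay = (d0, d1)[s][load[own] - 1]
        # After switching, the other resource has congestion load[other] + 1.
        other_delay = (d0, d1)[1 - s][load[other]]
        if other_delay < own_delay:
            result.append(i)
    return result
```

With five players and the state `[0, 0, 0, 1, 1]`, the helper reports exactly the first player clockwise of each of the two sets. Letting these two players switch alternately rotates the pattern by one position, so the initial state reappears after 2n best responses.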

In order to prove a lower bound on the convergence time of the random best response dynamics,
observe that whenever a player is selected uniformly at random, the total number of players playing
their 0-strategy increases by one with probability 1/2 and decreases by one with probability 1/2.
After the strategy change
either all players are synchronized, and therefore the random best response dynamics terminates,
or again we are in a state S' with two sets of synchronized players. Observe now that this process
is isomorphic to a random walk on a line with nodes $v_0, \ldots, v_n$. The node $v_i$ corresponds to the
fact that i players play their 0-strategy. As the expected time of a random walk on a line with
n + 1 nodes to reach one of the two ends of the line is $\Theta(n^2)$ if the walk starts in the middle of the
line [TT], we obtain a lower bound of $\Omega(n^2)$.
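The quadratic hitting time of the random walk can be verified exactly for small n. The following sketch (function name assumed) solves the recurrence $E_k = 1 + (E_{k-1} + E_{k+1})/2$ with $E_0 = E_n = 0$ for the expected number of steps $E_k$ until a fair walk started in $v_k$ reaches an end of the line.

```python
from fractions import Fraction


def expected_absorption_times(n):
    """Expected steps of a fair +-1 walk on v_0..v_n until it hits v_0 or v_n."""
    # Write E_k = a[k] * x + b[k] with the unknown x = E_1, using
    # E_{k+1} = 2 * E_k - E_{k-1} - 2, then fix x through E_n = 0.
    a = [Fraction(0), Fraction(1)]
    b = [Fraction(0), Fraction(0)]
    for k in range(1, n):
        a.append(2 * a[k] - a[k - 1])
        b.append(2 * b[k] - b[k - 1] - 2)
    x = -b[n] / a[n]
    return [a[k] * x + b[k] for k in range(n + 1)]
```

The values agree with the closed form k(n - k), which equals roughly n^2/4 for a start in the middle of the line.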

Corollary 3. There exists an infinite family of instances of player-specific congestion games on
circles with corresponding initial states such that the expected number of steps until the random best
response dynamics terminates is lower bounded by $\Omega(n^2)$.
3.2 An Upper Bound



In this section, we present an upper bound on the convergence time of the random best response 
dynamics in player-specific congestion games on circles. We prove the following theorem which 
matches the lower bound presented in Corollary 3.

Theorem 4. In every player-specific congestion game on a circle the random best response dynamics
terminates after $O(n^2)$ steps in expectation.

The remainder of this section is organized as follows. We characterize with respect to the types 
of the players in which cases there are cycles in the transition graphs of such games. We show 
that cycles only exist if all players are of type 3 or type 3'. We analyze the convergence time 
of deterministic best response dynamics in games with acyclic transition graphs by developing a
general framework that allows us to derive potential functions from which one can easily derive upper
bounds. Finally, we analyze the convergence time of the random best response dynamics in the
case of games with players of type 3 or type 3'. 

3.2.1 The Impact of Type 1 Players 

First, we investigate the impact of type 1 players on the existence of cycles in the transition graphs 
and on the convergence time of the best response dynamics. We claim that games with at least 
one player of type 1 do not possess cycles in their transition graphs. Intuitively, this is true since 
every player of type 1 changes its strategy at most once, whereas in a cycle every player changes 
its strategy at least twice. 

Lemma 5. Let $\Gamma$ be a player-specific congestion game on a circle. If there exists at least one
player of type 1 or 1', then $TG(\Gamma)$ is acyclic. Moreover, every best response schedule terminates
after at most $4n^2$ steps.

In order to prove Lemma 5, we first prove the following one.

Lemma 6. Let $\Gamma$ be a player-specific congestion game on a circle whose transition graph contains
cycles. Then every player changes its strategy at least twice in every cycle of $TG(\Gamma)$.

Proof. The fact that players involved in the cycle change their strategy an even number of
times is obvious. Thus, it remains to show that every player changes its strategy. For contradiction,
assume that there exists a player i and a cycle in $TG(\Gamma)$ such that player i does not change its
strategy on that cycle. In this case, we could remove the player from the game, and artificially
increase the congestion on the resource the player allocates by one. We would then obtain a
player-specific congestion game on a tree which cannot have a cycle in the transition graph due to
Theorem 2. □

Next we prove Lemma 5 for type 1 players. The proof for type 1' players is essentially the
same.

Proof of Lemma 5. Without loss of generality, let player i be of type 1. Then observe that player i
will never play its 1-strategy again, once it played its 0-strategy. Thus, by Lemma 6, $TG(\Gamma)$ is
acyclic.

In order to prove the convergence time, observe that if we fix player i to one of its strategies,
then we obtain a player-specific congestion game on a tree. Due to Theorem 2, the convergence
time of such games is upper bounded by $2n^2$. Furthermore, observe that the transition graphs of
these games are isomorphic to disjoint subgraphs of $TG(\Gamma)$. The first subgraph contains all nodes
of $TG(\Gamma)$ in which player i plays its 0-strategy, the second one contains all nodes in which player i
plays its 1-strategy. Finally, as all edges between these two subgraphs are directed from the second
one to the first one, and as the maximal length of any best response sequence in each of these
subgraphs is upper bounded by $2n^2$, the lemma follows. □
In the following, we will assume that there exists no player of type 1 or 1', as otherwise we
could apply Lemma 5.

3.3 A Framework to Analyze the Convergence Time 

In this section, we present a framework to analyze the convergence time of best response schedules 
in player-specific congestion games on circles. Let $\Gamma$ be a game such that there is no player of type
1 or 1'. First, we investigate whether there is a sufficient condition such that player i does not
want to change its strategy in a state S of $\Gamma$.

Observation 7. Suppose that player i is not of type 1 or 1'. Then if it is synchronized with the
players $i-1$ and $i+1$ in S, it has no incentive to change its strategy.

In the following, we call a resource r overloaded in state S if two players share r. Additionally, 
we call a resource r' underloaded in state S if no player allocates r'. Obviously in every state
of $\Gamma$, the total number of overloaded resources equals the total number of underloaded resources.
From Observation 7 we conclude that in every state S only players who allocate a resource that
is currently overloaded or who could allocate a resource that is currently underloaded might have 
an incentive to change their strategy. 

Based on this observation, we now present a general framework to analyze the convergence time 
of best response schedules. First, we introduce the notion of over- and underload tokens. Given an 
arbitrary state S of $\Gamma$, we place an overload token on every overloaded resource. Additionally, we
place an underload token on every underloaded resource. Obviously over- and underload tokens
alternate on the circle. Furthermore, note that a legal placement of tokens uniquely determines 
the strategies the players play. A placement of tokens is legal if no two tokens share a resource, 
and if the tokens alternate on the circle. 
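The token placement induced by a state, and the legality condition, can be sketched as follows. The encoding is an assumption of this example: `strategy[i]` is 0 or 1 for the 0- or 1-strategy of player i, and a placement is a list with one entry per resource, "O" for an overload token, "U" for an underload token, and `None` for no token.

```python
def tokens(strategy):
    """Place an overload token on every resource shared by two players and an
    underload token on every unused resource."""
    n = len(strategy)
    load = [0] * n
    for i, s in enumerate(strategy):
        load[(i + s) % n] += 1
    return ["O" if c == 2 else "U" if c == 0 else None for c in load]


def is_legal(placement):
    """Check that the tokens alternate around the circle.

    Using one list entry per resource already rules out two tokens on the
    same resource.
    """
    seq = [t for t in placement if t is not None]
    # Index j - 1 wraps around for j = 0, so the comparison is cyclic.
    return len(seq) % 2 == 0 and all(seq[j] != seq[j - 1] for j in range(len(seq)))
```

An overload token appears exactly where a player on its 1-strategy is followed clockwise by a player on its 0-strategy, and an underload token where the opposite change occurs, which is why the two kinds alternate and occur equally often.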

In the following, we investigate in which directions tokens move if players play best responses.
Consider first a sequence of resources $r_i, \ldots, r_j$ and assume that players $i, \ldots, j-1$ are of the
same type t. Additionally, assume that an overload token is placed on resource $r_k$, and that an
underload token is placed on resource $r_l$ with $i < k < l < j$. The scenario we consider is depicted
in Figure 1.

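[Figure 1: a sequence of players of the same type t, drawn with their orientation, between an
overloaded and an underloaded resource. The table below gives the direction in which each
token moves.]

            overload        underload
type 2      anticlockwise   clockwise
type 2'     clockwise       anticlockwise
type 3      clockwise       clockwise
type 3'     anticlockwise   anticlockwise

Figure 1: In which directions do the tokens move?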

Assume first that the distance (number of edges) between the two tokens is at least two, i.e.,
$|l - k| \ge 2$. In this case, each token can only move in one direction. The directions are uniquely
determined by the type of the players. They can be derived from investigating, with respect to
the players' type t, which players have incentives to change their strategy. The directions are
stated in Figure 1, too. Assume now that the distance between the two tokens is one. That is,
$k = l - 1$. Then, there exists a player who is interested in the over- and underloaded resource, and
who currently allocates the overloaded one. It is not difficult to verify that this player always has
an incentive to change its strategy. Note that this holds regardless of the player's type since we
assumed that there are no players of type 1 and 1'. Observe that after the strategy change of this
player all players $i, \ldots, j-1$ are synchronized and therefore there exist no over- and underloaded
resources anymore. In the following, we call such an event a collision of tokens.

So far, we considered sequences of players of the same type and observed that there is a unique 
direction in which tokens of the same kind move. In sequences with multiple types of players such 
unique directions do not exist any longer, i.e., overload as well as underload tokens can move in 
both directions. However, if two players of different types share a resource and if due to best 
responses of both players an over- or underload token moves onto this resource, then the token 
could stop there. In the following, we formalize this observation with respect to overload tokens 
and introduce the notion of termination points. 



Definition 8. We call a resource $r_i$ a termination point of an overload token if the following
conditions are satisfied.

1. The players $i-1$ and i have different types. Let these types be $t_{i-1}$ and $t_i$.

2. In sets of consecutive players of type $t_{i-1}$ overload tokens move clockwise, whereas they move
anticlockwise in sets of consecutive players of type $t_i$.


We illustrate the definition in Figure 2(a). Let player $i-1$ be of type 3, and let player i be
of type 2. In this case, the requirements of the definition are satisfied. Assume that player $i-1$
plays its 1-strategy and that it is synchronized with player $i-2$. Additionally, assume that player
i plays its 0-strategy and that it is synchronized with player $i+1$. Observe now that the token
cannot move as neither player $i-1$ nor player i has an incentive to change its strategy. Suppose
now that initially all players along the path play their 0-strategy. Then an overload token that
moves from the left to the right along the path stops at $r_i$. The token may only move on if one of
the two players is not synchronized with its neighbor any longer. In this case, this player always
has an incentive to change its strategy as it can allocate a resource that is currently underloaded.
Thus, an underload and an overload token collide. Additionally, if initially all players play their
1-strategy and an overload token moves from the right to the left along the path, we observe the
same phenomenon. The token cannot pass the resource $r_i$ unless it collides with an underload
token.

Note that the definition of a termination point can easily be adapted to underload tokens. A
list of all termination points is given in Figure 2(b). In the left column we present all termination
points for overload tokens, in the right one for underload tokens.



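[Figure 2(a): Example of a termination point. A type 3 player and a type 2 player, drawn with the orientation of the players, share an overloaded resource.]

[Figure 2(b): List of all termination points. Type pairs shown: 2' 2, 3 3', 3 2, 2' 3', 2 2', 3 3', 2 3', 3 2'.]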
3.4 Analyzing the Convergence Time

In this section, we analyze the convergence time in player-specific congestion games on circles. We 
distinguish between the following four cases. 

Case 1: For both kinds of tokens there exists at least one termination point. 

Case 2: Only for one kind of tokens there exists at least one termination point. 

Case 3: There exist no termination points but over- and underload tokens move in opposite 
directions. 

Case 4: There exist no termination points and over- and underload tokens move in the same 
direction. 

In the first two cases, we present potential functions and prove that the transition graphs of
such games are acyclic and that every best response schedule terminates after $O(n^2)$ steps. In the
third case, we can do slightly better and prove an upper bound of $O(n)$ on the convergence time.
In all cases one can easily construct matching lower bounds. Only in the fourth case can deterministic
best response schedules cycle. In this case, we prove that the random best response schedule
terminates after $O(n^2)$ steps in expectation.

Before we take a closer look at the different cases, we discuss which games with respect to their 
players' types belong to which case. Games only with players of type 2 and 2' or only with players 
of type 3 and 3' belong to the first case. Additionally, some games with more than two types of 
players belong to this case. The second case covers all games with at least three different kinds of 
players which do not belong to the first case. Furthermore, it covers games with type 2 and type 
3 players, with type 2' and type 3' players, type 2' and type 3 players, and with type 2 and type 
3' players. Games with type 2 players only, or games with type 2' players only belong to the third 
case. Finally, games with type 3 players only and games with type 3' players only belong to the 
fourth case. These observations can easily be derived from Figure 2(b).



3.4.1 Case 1 

Lemma 9. Let $\Gamma$ be a player-specific congestion game on a circle with termination points for both
kinds of tokens. Then $\Gamma$ is a potential game, and every best response schedule terminates after
$O(n^2)$ steps.

Proof. Let $S$ be a state of $\Gamma$ and consider the mapping that maps every token in $S$ to the next
termination point lying in the direction in which the token moves. In the following, we define
$d(t, S)$ as the distance of a token $t$ in state $S$ to its corresponding termination point. Obviously
$d(t, S) \leq n$. Consider now the potential function $\phi(S) = \sum_{\text{token } t} d(t, S)$ and suppose that a player
plays a best response. Then either one token moves closer to its termination point or two tokens
collide. In both cases $\phi(S)$ decreases by at least 1. Thus, $\phi(S)$ strictly decreases if a player plays
a best response and therefore, $TG(\Gamma)$ is acyclic. Moreover, as $\phi(S)$ is upper bounded by $O(n^2)$,
every best response schedule terminates after $O(n^2)$ steps. □
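
The bound on the potential used at the end of the proof can be spelled out as a short worked inequality. This is only a sketch: it assumes, as in the token model on circles, that each of the n resources carries at most one token, so a state contains at most n tokens, and that every distance $d(t,S)$ is at most n.

```latex
\phi(S) \;=\; \sum_{\text{token } t} d(t,S)
\;\le\; \underbrace{n}_{\text{number of tokens}} \cdot \underbrace{n}_{\max_t d(t,S)}
\;=\; n^2 .
```

Since $\phi$ takes non-negative integer values and drops by at least 1 with every best response, at most $n^2$ best responses can occur under these assumptions.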

3.4.2 Case 2 

Lemma 10. Let $\Gamma$ be a player-specific congestion game on a circle with termination points only
for one kind of token. Then $\Gamma$ is a potential game, and every best response schedule terminates
after $O(n^2)$ steps.

Proof. Without loss of generality, assume that termination points only exist for overload tokens.
In this case, we define $d(t_o, S)$ for every overload token $t_o$ as in the proof of Lemma 9. For every
underload token $t_u$ we define $d(t_u, S)$ as follows. Let $t_o$ be the first overload token lying in the
same direction as $t_u$ moves.

1. If $t_o$ moves in the opposite direction to $t_u$, then we define $d(t_u, S)$ as the distance between
the two tokens. The distance of two tokens moving in opposite directions is defined as the
number of moves of these tokens until they collide.

2. If $t_o$ moves in the same direction as $t_u$, then we define $d(t_u, S)$ as the distance between $t_u$
and $t_o$ plus the distance between $t_o$ and the first termination point at which $t_o$ has to stop.
Thus, $d(t_u, S)$ equals the maximum number of moves of these two tokens until they collide.

Observe that for every underload token $t_u$, $d(t_u, S) \leq 2n$. Consider the potential function
$\phi$ that maps every state $S$ to the pair $\phi(S) = (\phi_1(S), \phi_2(S)) \in \mathbb{N} \times \mathbb{N}$, where $\phi_1(S)$ equals the total number of
overload tokens in $S$ and $\phi_2(S)$ equals the sum of all $d(t, S)$ for all under- and overload tokens.
Suppose now that a player plays a best response. Obviously, if two tokens collide, then $\phi_1(S)$
decreases by one. Moreover, if there is no collision, then $\phi_2(S)$ decreases. Note that in the first
case $\phi_2$ may increase. This may happen if, due to the collision, $d(t_u, S)$ of a remaining underload
token $t_u$ has to be recomputed as its associated overload token has been removed. The new value
is upper bounded by the sum of the old values of $t_u$ and the collided underload token plus 1. Now
consider the lexicographic ordering $<_\phi$ of the states of $\Gamma$ with respect to $\phi$. Let $S$ and $S'$ be two
states of $\Gamma$. Then

$$S <_\phi S' \iff \phi_1(S) < \phi_1(S') \ \text{ or } \ \bigl(\phi_1(S) = \phi_1(S') \text{ and } \phi_2(S) < \phi_2(S')\bigr).$$

Observe that $\phi$ strictly decreases with respect to $<_\phi$ if a player plays a best response. Thus, $TG(\Gamma)$ is acyclic.
Additionally, observe that $\phi_1$ is upper bounded by $n$, and that $\phi_2$ is upper bounded by $O(n^2)$. However,
as $\phi_2$ only increases by one when $\phi_1$ decreases, we conclude that every best response schedule
terminates after $O(n^2)$ steps. □
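
The lexicographic argument can be illustrated with a small Python sketch. Python tuples compare lexicographically, which matches the ordering $<_\phi$ used in the proof. The function name `is_decrease` and the concrete numbers are made up for illustration and do not come from a particular game.

```python
# Potential values are pairs (phi1, phi2): phi1 = number of overload tokens,
# phi2 = sum of all token distances. Python compares tuples lexicographically:
# first by phi1 and, on ties, by phi2.

def is_decrease(before, after):
    """Return True if `after` is lexicographically smaller than `before`."""
    return after < before

# A move without a collision: phi1 is unchanged, phi2 drops by one.
move = ((3, 17), (3, 16))
# A collision: phi1 drops by one, while phi2 may grow by one.
collision = ((3, 16), (2, 17))

for before, after in (move, collision):
    print(before, "->", after, "decreases:", is_decrease(before, after))
```

Both example steps count as a decrease, which is why an occasional increase of the second component does not allow cycles.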



3.4.3 Case 3 

Lemma 11. Let $\Gamma$ be a player-specific congestion game on a circle with no termination points in
which over- and underload tokens move in opposite directions. Then $\Gamma$ is a potential game, and
every best response schedule terminates after $O(n)$ steps.

Proof. Let $S$ be a state of $\Gamma$ and consider the one-to-one mapping that maps every overload token
to the next underload token lying in the direction in which the token moves. We define the distance
of such a pair of tokens as the maximum number of moves of these two tokens until they collide.

Suppose now that a player plays a best response. Then either the number of overload tokens
or the distance between one pair of tokens decreases by one. Consider now the potential function
$\phi$ with $\phi(S) = (\phi_1(S), \phi_2(S)) \in \mathbb{N} \times \mathbb{N}$, where $\phi_1(S)$ equals the number of overload tokens in
$S$, and $\phi_2(S)$ equals the sum of all distances of pairs of tokens. Observe now that in the case of a
best response, $\phi_1$ either decreases by 1 or remains unchanged. In the first case, $\phi_2$ may increase
by 1. This is true as tokens from different pairs may collide. However, this can happen at most n
times. If this happens, the remaining two tokens form a new pair whose distance equals the sum of
the distances of the previous pairs plus 1. In the second case, $\phi_2$ decreases by 1. Then by similar
arguments as in the proof of Lemma 10 we conclude that $TG(\Gamma)$ is acyclic. Finally, observe that
$\phi_1$ is upper bounded by $n$. Moreover, $\phi_2$ is upper bounded by $n$, too. As $\phi_2$ only increases
by one when $\phi_1$ decreases, we conclude that every best response schedule terminates after $O(n)$
steps. □



3.4.4 Case 4 

In the following, we present a proof of the fourth case for players of type 3. By symmetry of the 
types 3 and 3', the same result holds for games with players of type 3', too. 
Lemma 12. Let $\Gamma$ be a player-specific congestion game on a circle in which all players are of type
3. Then the random best response schedule terminates after $O(n^2)$ steps in expectation.

Proof. In order to prove the lemma, we prove the following lemma. 

Lemma 13. In every state $S$ of $\Gamma$ the number of players who want to change from their 0- to their
1-strategy equals the number of players who want to change from their 1- to their 0-strategy.

Proof. In the following, we call a synchronized set of consecutive players maximal if the players
adjacent to both ends of the set play different strategies than the synchronized players. Obviously,
in every state $S$ of $\Gamma$ which is not an equilibrium the number of maximal synchronized sets of
players playing their 0-strategy equals the number of maximal synchronized sets of players playing
their 1-strategy.

We now prove that in every maximal synchronized set of consecutive players only the first
player clockwise has an incentive to change its strategy. Thus, in every maximal set, there is only
a single player who wants to change its strategy. Note that this suffices to prove the lemma.

First, consider a maximal synchronized subset of consecutive players $\mathcal{N}' = \{i, \ldots, j\}$ which all
play their 0-strategy. Then player $i-1$ plays its 1-strategy, and therefore the players $i-1$ and $i$
share resource $r_i$. In this case, player $i$ can decrease its delay by changing to its 1-strategy. Other
players $k \in \mathcal{N}'$, $k \neq i$, do not have an incentive to change their strategy as this would increase
their delay.

Second, consider a maximal synchronized subset of consecutive players $\mathcal{N}' = \{i, \ldots, j\}$ which
all play their 1-strategy. Then player $i-1$ plays its 0-strategy and therefore no player currently
allocates resource $r_i$. Observe now that player $i$ can decrease its delay by changing to its 0-strategy.
Again, all other players $k \in \mathcal{N}'$, $k \neq i$, do not have an incentive to change their strategy as this
would increase their delay. This is especially true for the last player, who currently allocates an
overloaded resource. □
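
The counting step at the start of this proof can be checked mechanically. The following Python sketch models a state simply as a cyclic 0/1 sequence of the players' strategies (a simplifying assumption made here for illustration) and counts maximal blocks of each kind; the function name `count_maximal_blocks` is hypothetical.

```python
def count_maximal_blocks(strategies):
    """Count maximal blocks of 0s and of 1s in a cyclic 0/1 sequence.

    A block starts at position i when strategies[i] differs from its
    cyclic predecessor strategies[i - 1].
    """
    zeros = ones = 0
    for i, s in enumerate(strategies):
        if s != strategies[i - 1]:   # index -1 wraps around the circle
            if s == 0:
                zeros += 1
            else:
                ones += 1
    return zeros, ones

state = [0, 0, 1, 0, 1, 1, 1, 0]
print(count_maximal_blocks(state))   # prints (2, 2)
```

On a circle every block of 0s is followed by a block of 1s and vice versa, so the two counts always agree. Together with the claim that each maximal set contains exactly one unsatisfied player, this gives the equality stated in the lemma.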

Consider now the random best response schedule activating an unsatisfied player uniformly at
random. From Lemma 13 we conclude that the total number of players playing their 0-strategy
increases or decreases by 1, each with probability 1/2. Combining this with the observation that in a
Nash equilibrium all players play the same strategy, we conclude that the random best response
schedule is isomorphic to a random walk on a line with $n + 1$ vertices. Vertex $v_i$ corresponds to
the states in which $i$ players play their 0-strategy. As the expected time of such a random walk to reach one of
the two ends of the line is $O(n^2)$, the lemma follows. □
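
The $O(n^2)$ bound for the random walk is a standard fact: for a fair walk on the vertices $0, \ldots, n$ started at $i$, the expected number of steps until it reaches $0$ or $n$ is $i(n - i) \leq n^2/4$. The following Python sketch verifies that this closed form satisfies the defining recurrence and compares it with a simulation. The function names are chosen here for illustration and are not part of the paper.

```python
import random

def expected_steps(n, i):
    """Expected number of steps of a fair +-1 walk from i until it hits 0 or n."""
    return i * (n - i)

def check_recurrence(n):
    """Verify h(0) = h(n) = 0 and h(i) = 1 + (h(i-1) + h(i+1)) / 2 for 0 < i < n."""
    h = [expected_steps(n, i) for i in range(n + 1)]
    inner = all(2 * h[i] == 2 + h[i - 1] + h[i + 1] for i in range(1, n))
    return h[0] == 0 and h[n] == 0 and inner

def simulate(n, start, rng):
    """Run one walk until absorption and return the number of steps taken."""
    pos, steps = start, 0
    while 0 < pos < n:
        pos += rng.choice((-1, 1))
        steps += 1
    return steps

if __name__ == "__main__":
    n, start, runs = 20, 10, 2000
    rng = random.Random(0)
    average = sum(simulate(n, start, rng) for _ in range(runs)) / runs
    print(check_recurrence(n), expected_steps(n, start), round(average, 1))
```

With `n = 20` and `start = 10` the closed form gives 100 expected steps, and the simulated average should be close to that value.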



4 Player-specific Congestion Games on General Graphs 

In this section, we consider player-specific congestion games on general graphs and present evidence
supporting Conjecture 1 by constructing a family of instances for which experimental results clearly
show a super-polynomial convergence time. Our analysis of player-specific congestion games on
circles is based on the notion of over- and underload tokens, and there is no straightforward
extension of this notion to player-specific singleton congestion games on general graphs. The
instances we construct have, however, the property that every resource has a fixed congestion that
is taken in every Nash equilibrium, and we can define tokens with respect to these congestions. To
be precise, if the congestion on a resource deviates by $x$ from the equilibrium congestion, we place
$x$ overload tokens in the case $x > 0$ and we place $-x$ underload tokens in the case $x < 0$. Note
that for circles with type 3 players this definition coincides with the former definition of tokens.

The crucial property of games on circles with type 3 players leading to polynomial convergence 
is that the number of tokens cannot increase. The instances we construct in this section are in some 
sense similar to circles with type 3 players, but we attach additional gadgets to the nodes which 
can occasionally increase the number of tokens. We start with a circle with n type 3 players and 
replace each edge by n parallel edges. This modification allows each node to store more than one
token of the same kind if the preferences of the players are adjusted accordingly. Other properties 
are not affected by this modification, that is, over- and underload tokens still move in the same 
direction with approximately the same speed and if an overload and an underload token meet, 
they both vanish. Each time a node contains at least two tokens of the same kind, the gadget 
attached to the node is triggered with constant probability. If a gadget is triggered, it can emit a 
new pair of overload and underload token into the circle. Usually, this new pair is stored in the 
gadget and only emitted after the triggering tokens have moved on a linear number of steps. The 
new tokens are not emitted simultaneously but the second is usually only emitted after the first 
one has moved on a linear number of steps in order to prevent the new tokens from canceling each 
other out immediately. 

Initially we introduce two overload tokens at node 0 and two underload tokens at node n/2. The
two overload tokens move independently through the circle starting at the same node. Typically 
they meet a couple of times before they meet the underload tokens and vanish. The same is true 
for the underload tokens as well, meaning that typically a couple of gadgets get triggered before the 
initial tokens vanish. Hence, the number of tokens has a tendency to increase. Since the triggered 
gadgets emit the stored tokens in a random order, the random process soon becomes unwieldy 
and we fail to rigorously prove that it takes super-polynomial time in expectation until all tokens 
vanish. This conjecture is, however, strongly supported by simulations. 



4.1 Our Construction 



Given $n \in \mathbb{N}$ we construct a player-specific congestion game $\Gamma_n$ consisting of n gadgets $G_0, \ldots, G_{n-1}$
as follows. In the following, the notion of a gadget differs from the notion used in the previous
discussion. Previously, we described how to attach gadgets to a circle in order to illustrate the
relation to games on circles. Now the gadgets themselves are arranged on a circle. A single gadget $G_i$ is depicted
in Figure 2(c). It consists of 4 resources $r_{i,0}, \ldots, r_{i,3}$ and 5n players. Each edge in the figure
represents n of them. The gadgets are arranged on a circle such that for every i the resources $r_{i,3}$
and $r_{i+1,0}$ coincide. Thus, for every i, 6n players are interested in $r_{i,0}$ and in $r_{i,3}$, and 2n players are
interested in $r_{i,1}$ and in $r_{i,2}$.

For every player who chooses between the two resources $r_{i,k}$ and $r_{i,l}$ with $l < k$ we call $r_{i,l}$
the 0-strategy and $r_{i,k}$ the 1-strategy of that player. In the following, we refer by the term type j
player to a player represented by an edge $e_j$. The player-specific delay functions are defined as
follows. All players of the same type j have the same functions for the two resources they choose
between. We define these functions in terms of a threshold t for their 0-strategies, meaning that
the 0-strategy is a best response if and only if the total number of other players allocating the
0-strategy resource is less than or equal to the threshold t. Otherwise the 1-strategy is the best response.



The thresholds are defined as depicted in Figure 2(d).

[Figure 2(c): Gadget $G_i$.]

(d) The player-specific delay functions:

type 0: $t_0 = 3n$
type 1: $t_1 = n - 1$
type 2: $t_2 = 3n - 2$
type 3: $t_3 = n - 1$
type 4: $t_4 = 3n - 1$

Figure 2: The lower bound construction
In the next sections, we prove that every resource has the same congestion in every Nash
equilibrium. We proceed with a description of how gadgets can generate new tokens. Finally, we
present results obtained from simulations.

4.2 Properties of Nash Equilibria 

In order to simplify the following discussion, we introduce the term $c^b_{i,j}(S) \in \mathbb{N}$, $b \in \{0, 1\}$, to
denote the number of type j players in gadget i who play their b-strategy in state $S$. Furthermore,
we define $n_{i,j}(S) = n_{r_{i,j}}(S)$. In the following, let $S^*$ be a Nash equilibrium of $\Gamma_n$. For ease of
notation, we use $c^b_{i,j} = c^b_{i,j}(S^*)$ and $n_{i,j} = n_{i,j}(S^*)$. The following observation is true because $S^*$
is a Nash equilibrium.

Observation 14. Let $j \in \{1, 3\}$ and $b \in \{0, 1\}$. Then for every $0 \leq i < n$ the number of type
j players playing their b-strategy in gadget $G_i$ in $S^*$ is uniquely determined by the number of type
$j-1$ players playing their b-strategy in gadget $G_i$ in $S^*$, i.e., $c^b_{i,j-1} = c^b_{i,j}$.

Next, we prove that every resource has the same congestion in every Nash equilibrium.

Lemma 15. For every Nash equilibrium $S^*$ of $\Gamma_n$ and every $0 \leq i < n$,

$$n_{i,0} = 3n \quad \text{and} \quad n_{i,1} = n_{i,2} = n.$$

Proof. First observe that for every gadget $G_i$, it holds that

$$c^0_{i,0} \geq c^0_{i,4} \quad \text{and} \quad c^0_{i,4} \geq c^0_{i,2}.$$


If the first inequality were not true, then there would exist type 0 players in $G_i$ playing their 1-strategy and
type 4 players playing their 0-strategy. However, since $S^*$ is a Nash equilibrium, all type 4 players
in $G_i$ who play their 0-strategy are satisfied and thus $n_{i,0} \leq 3n$. We observe that all type 0 players
currently playing their 1-strategy then have an incentive to change their strategy, a contradiction. A similar argument
proves the second inequality. Essentially, the same arguments prove the following implications:

$$c^0_{i,0} < n \ \Rightarrow\ c^0_{i,4} = 0,$$

$$c^0_{i,4} < n \ \Rightarrow\ c^0_{i,2} = 0.$$

Now consider an arbitrary gadget $G_i$ and let $3n - k_{i-1}$ be the number of players from gadget $G_{i-1}$
allocating resource $r_{i,0}$. In the following, we discuss how the parameter $k_{i-1}$ affects the choices
of the players in gadget $G_i$ in the Nash equilibrium $S^*$. We prove that the best responses of the
players in $G_i$ are uniquely determined by the parameter $k_{i-1}$. In order to do so, we distinguish 6
cases.

1. Case $k_{i-1} = 0$: All type 1, type 3, and type 4 players in gadget $G_{i-1}$ play their 1-strategy. Due
to Observation 14 we conclude that all type 0 and type 2 players in $G_{i-1}$ play their 1-strategy
as well, and therefore the congestion on $r_{i-1,0}$ is at most 3n. In this case, however, all type 0
players in $G_{i-1}$ have an incentive to play a best response. We conclude that this case does
not appear in a Nash equilibrium.

2. Case $1 \leq k_{i-1} < n$: $k_{i-1} + 1$ type 0 and $k_{i-1} + 1$ type 1 players in $G_i$ play their 0-strategy.
The remaining players in $G_i$ play their 1-strategy. Thus $k_i = k_{i-1} + 1$.

3. Case $k_{i-1} = n$: All type 0 and all type 1 players in $G_i$ play their 0-strategy; all other players
in $G_i$ play their 1-strategy. Thus $k_i = k_{i-1}$.

4. Case $n < k_{i-1} \leq 2n$: All type 0 and all type 1 players in $G_i$ play their 0-strategy. Additionally,
$k_{i-1} - n$ type 4 players in $G_i$ play their 0-strategy. The remaining players in $G_i$ play their
1-strategy. Thus $k_i = k_{i-1}$.

5. Case $2n < k_{i-1} < 3n$: All type 0, all type 1 and all type 4 players in $G_i$ play their 0-strategy.
Additionally, $k_{i-1} - 2n - 1$ type 2 and $k_{i-1} - 2n - 1$ type 3 players in $G_i$ play their 0-strategy.
The remaining players in $G_i$ play their 1-strategy. Thus $k_i = k_{i-1} - 1$.

6. Case $k_{i-1} = 3n$: Similar arguments as in the first case show that this case does not appear in
a Nash equilibrium.

As an intermediate observation we conclude that the lemma is true if at least one gadget $G_i$
exists for which $n \leq k_i \leq 2n$ holds. In this case $k_{i-1} = k_i$ for every $1 \leq i < n$ and the players play
the strategies as described above.

Next we take a closer look at the second and fifth case. We begin with the second one, in which
$1 \leq k_{i-1} < n$ implies $k_i = k_{i-1} + 1$, which implies $k_{i+1} = k_{i-1} + 2$, and so on until $k_j = n$. At
this point we enter the third case, which implies $k_{j+1} = n$ and so on. Going once around the circle, this leads to a
contradiction since $k_{i-1} < n$. Thus, whenever there exists a gadget for which $k_i < n$ holds, $S^*$
is not a Nash equilibrium. Similar arguments show that the fifth case does not appear in a Nash
equilibrium either. □
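
The case analysis can be replayed with a few lines of Python. The sketch below encodes the map from $k_{i-1}$ to $k_i$ under one reading of the six cases (case 2 for $1 \leq k < n$, cases 3 and 4 for $n \leq k \leq 2n$, case 5 for $2n < k < 3n$, and no equilibrium choice for $k = 0$ or $k = 3n$). It then checks which start values return to themselves after passing all n gadgets on the circle. The names `next_k` and `closes_up` are hypothetical.

```python
def next_k(k, n):
    """Map k_{i-1} to k_i following the six cases; None if no equilibrium choice exists."""
    if k == 0 or k == 3 * n:      # cases 1 and 6: impossible in a Nash equilibrium
        return None
    if k < n:                     # case 2: the parameter grows by one
        return k + 1
    if k <= 2 * n:                # cases 3 and 4: the parameter is preserved
        return k
    return k - 1                  # case 5: the parameter shrinks by one

def closes_up(k0, n):
    """Check whether starting with k0 and passing all n gadgets returns to k0."""
    k = k0
    for _ in range(n):
        k = next_k(k, n)
        if k is None:
            return False
    return k == k0

n = 4
print([k for k in range(3 * n + 1) if closes_up(k, n)])   # prints [4, 5, 6, 7, 8]
```

Only the values with $n \leq k \leq 2n$ are consistent around the circle, which mirrors the contradiction derived for the second and fifth case.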



4.3 Generating New Tokens 

Motivated by Lemma 15, we are now ready to introduce a new notion of tokens.

Definition 16. Let $S$ be an arbitrary state of $\Gamma_n$ and let $n^*_r$ be the congestion on a resource $r$
in every Nash equilibrium. Then, we place over- and underload tokens on the resources according to the
following rules.

1. If $n_r(S) = n^*_r + k$, $k \in \mathbb{N}$, then we place k overload tokens on $r$.

2. If $n_r(S) = n^*_r - k$, $k \in \mathbb{N}$, then we place k underload tokens on $r$.
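
As a minimal illustration of this definition, the following Python sketch computes the tokens on a single resource from its current congestion and its equilibrium congestion. The function name `place_tokens` and the example numbers are assumptions made for illustration.

```python
def place_tokens(congestion, equilibrium_congestion):
    """Return (overload_tokens, underload_tokens) for one resource."""
    diff = congestion - equilibrium_congestion
    if diff > 0:
        return diff, 0      # k overload tokens when the congestion exceeds n*_r by k
    return 0, -diff         # k underload tokens when it falls short by k

# Example with n = 5, where the equilibrium congestion of r_{i,0} is 3n = 15.
print(place_tokens(17, 15))   # prints (2, 0)
print(place_tokens(13, 15))   # prints (0, 2)
print(place_tokens(15, 15))   # prints (0, 0)
```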

Next we describe how the number of overload and underload tokens can increase. This can
happen if there are either at least two overload or at least two underload tokens on $r_{i,0}$. In the
following, we discuss the first case in detail. The second case is only depicted in Figure 5.

Consider a single gadget $G_i$ as depicted in Figure 4(a). Numbers attached to resources correspond
to the number of tokens lying on them. Positive numbers indicate that overload tokens
are present, negative numbers indicate that underload tokens are present. Numbers a attached to
edges indicate that a players represented by that edge play their 0-strategy, whereas $n - a$ players
play their 1-strategy.



Configuration 4(a): Initially, there are two overload tokens on $r_{i,0}$. In this case, all type 0 and
all type 4 players have an incentive to change to their 1-strategies. All other players are
satisfied. With probability 2/3, given that a player from $G_i$ is selected, a type 0 player is
selected and configuration 4(b) is reached, in which there is one overload token on $r_{i,0}$
and one on $r_{i,1}$.

Configuration 4(b): All type 1 and all type 4 players have an incentive to change to their
1-strategy. With probability 2/3 configuration 4(c) is reached, in which there is one overload
token on $r_{i,0}$ and one on $r_{i,3}$.

Configuration 4(c): Still all type 4 players have an incentive to change to their 1-strategy. However,
we assume that the overload token which currently lies on $r_{i,3}$ moves on due to a best
response of a player in gadget $G_{i+1}$. In this case, configuration 4(d) is reached, in which there
is still one overload token on $r_{i,0}$. Additionally, one overload token is in gadget $G_{i+1}$.

Configuration 4(d): Again, all type 4 players have an incentive to change to their 1-strategy.
Now one of these players is selected and configuration 4(e) is reached, in which there is one
overload token on $r_{i,3}$.

Configuration 4(e): In this configuration, the overload token on $r_{i,3}$ can move to the next gadget.
Observe that this event is much more likely than the next one, in which the only type 0 player
playing its 1-strategy switches back to its 0-strategy. All other players are satisfied. If both
events take place, configuration 4(f) is reached. Note that in this case additional tokens are
generated. There is a new underload token on $r_{i,1}$ and a new overload token on $r_{i,0}$.

Configuration 4(f): Finally, all $n - 1$ type 4 players playing their 0-strategy have an incentive to
change to their 1-strategy. Additionally, the only type 1 player playing its 1-strategy wants
to change back to its 0-strategy.



4.4 Simulations 

We simulated the random best response dynamics in games $\Gamma_n$ and obtained the results shown in
Figure 3. On the x-axis we plotted the parameter n, on the y-axis the average number of best
responses until the random best response dynamics terminated. Observe that the y-axis is plotted
in log-scale. For every $n \in \{5, 10, \ldots, 180, 185\}$ we started the random best response dynamics
from the following initial configuration: all type 0 and all type 1 players play their 0-strategies; all
type 2 and all type 3 players play their 1-strategies. Additionally, n/2 type 4 players in the gadgets
$G_0, \ldots, G_{n/2-1}$ and n/2 + 2 type 4 players in the gadgets $G_{n/2}, \ldots, G_{n-1}$ play their 1-strategy.
All other type 4 players play their 0-strategy. This initial configuration corresponds to placing two
overload tokens on $r_{0,0}$ and two underload tokens on $r_{n/2,0}$. For n < 160 we took the average over
400 runs, and for larger n we took the average over 100 runs.
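
A simulation of this kind could be set up as in the following Python sketch. It is not the authors' code and rests on several assumptions that are labeled here and in the comments. The gadget layout is inferred from the text: type 0 players choose between $r_{i,0}$ and $r_{i,1}$, type 1 between $r_{i,1}$ and $r_{i,3}$, type 2 between $r_{i,0}$ and $r_{i,2}$, type 3 between $r_{i,2}$ and $r_{i,3}$, and type 4 between $r_{i,0}$ and $r_{i,3}$. The thresholds are read from Figure 2(d), n/2 is rounded down, and n is assumed to be at least 3. Since players of the same type in the same gadget are interchangeable, the state only stores how many of them play their 0-strategy. All function names are hypothetical.

```python
import random

def thresholds(n):
    # Thresholds t_0..t_4 for the 0-strategy, as read from Figure 2(d) (assumed).
    return [3 * n, n - 1, 3 * n - 2, n - 1, 3 * n - 1]

def initial_state(n):
    """c[i][j] = number of type j players in gadget i playing their 0-strategy."""
    h = n // 2  # assumption: n/2 is rounded down for odd n
    return [[n, n, 0, 0, n - h if i < h else n - h - 2] for i in range(n)]

def zero_load(c, n, i, j):
    """Congestion on the 0-strategy resource of the type j players in gadget i."""
    if j == 1:                       # r_{i,1}: type 0 on 1-strategy, type 1 on 0-strategy
        return n - c[i][0] + c[i][1]
    if j == 3:                       # r_{i,2}: type 2 on 1-strategy, type 3 on 0-strategy
        return n - c[i][2] + c[i][3]
    p = c[i - 1]                     # previous gadget; index -1 wraps around the circle
    # r_{i,0}: types 0, 2, 4 of gadget i on their 0-strategy plus
    # types 1, 3, 4 of gadget i-1 on their 1-strategy.
    return c[i][0] + c[i][2] + c[i][4] + 3 * n - p[1] - p[3] - p[4]

def unsatisfied(c, n):
    """List (gadget, type, delta, count): `count` players want to change c[i][j] by delta."""
    t, moves = thresholds(n), []
    for i in range(len(c)):
        for j in range(5):
            load = zero_load(c, n, i, j)
            if c[i][j] > 0 and load - 1 > t[j]:
                # Players on the 0-strategy see load - 1 other players there.
                moves.append((i, j, -1, c[i][j]))
            elif c[i][j] < n and load <= t[j]:
                # Players on the 1-strategy see `load` other players there.
                moves.append((i, j, +1, n - c[i][j]))
    return moves

def run(n, rng, max_steps):
    """Random best response dynamics; returns the number of steps or None if capped."""
    c = initial_state(n)
    for step in range(max_steps):
        moves = unsatisfied(c, n)
        if not moves:
            return step
        # Weighting by the group sizes picks an unsatisfied player uniformly at random.
        i, j, delta, _ = rng.choices(moves, weights=[m[3] for m in moves])[0]
        c[i][j] += delta
    return None

if __name__ == "__main__":
    print(run(4, random.Random(0), 20000))
```

In the initial state for n = 4, the unsatisfied players are the type 0 and type 4 players on $r_{0,0}$ and the type 2 and type 4 players who can move to $r_{2,0}$, which matches the placement of two overload and two underload tokens. The step cap `max_steps` is a practical safeguard, since the running time is conjectured to grow super-polynomially in n.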
Figure 3: Average number of best responses



Unfortunately, it does not seem feasible to simulate the best response dynamics for much larger
values of n. We believe, however, that the results in Figure 3 are a clear indication of a
super-polynomial, maybe even exponential, convergence time.
[Figure 4 panels: (a) Initial configuration. (b) One overload token detours to the upper path. (c) It continues on the upper path (d) ... and moves to the next gadget. (e) The second overload token moves. (f) New tokens are generated.]

Figure 4: The number of tokens increases along the upper path.
[Figure 5 panels: (b) One underload token detours to the lower path. (c) The second overload token moves. (d) ... and moves to the next gadget. (f) New tokens are generated.]

Figure 5: The number of tokens increases along the lower path.